Image encoder, image decoder, image encoding method, and image decoding method

ABSTRACT

An input image signal 101 is divided into MC block units and, when coding processing is performed in these divided units, a motion compensation section 107 generates a motion-compensated prediction image 106 a by detecting movement amounts in predetermined MC block units, a smoothing filter section 124 performs, with respect to the prediction image 106 a, smoothing of pixels located at the boundaries of adjoining MC blocks on the basis of predetermined evaluation criteria, and a prediction residual signal 108, which is obtained from the difference between the prediction image 106 b obtained by the smoothing, and the input image (input image signal 101), is encoded. Accordingly, it is possible to use relatively straightforward computation to perform processing, with respect to a prediction frame image generated by block-unit motion-compensated interframe prediction (MC), to adaptively smooth a discontinuous waveform generated between MC blocks of the prediction frame image, whereby the efficiency of low bit rate coding that employs interframe MC is improved.

TECHNICAL FIELD

The present invention relates to an image coding apparatus, an image decoding apparatus, an image coding method, and an image decoding method that perform the transmission and storage of images with a small encoding data volume and that are applied to a mobile image transmission system or similar.

BACKGROUND ART

Conventional image coding systems are constituted to divide image frames into blocks of a fixed size and then perform coding processing in these divided units. Typical examples of conventional image coding systems include the MPEG (Moving Picture Experts Group) 1 coding system as described in Le Gall, D.: "MPEG: A Video Compression Standard for Multimedia Applications", Trans. ACM, April 1991.

MPEG 1 performs motion-compensated interframe prediction (MC: Motion Compensation) by dividing image frames into fixed block units known as macroblocks, detecting movement amounts (or motion vectors) by referencing a previously encoded local decoding frame image, specifying similar blocks from within a reference image, and employing these similar blocks as predictive data. By means of this technique, even when there is motion in an image, the prediction efficiency can be improved by tracking the motion, and redundancy in a temporal direction can be reduced. Furthermore, redundancy that remains in a spatial direction can be reduced by employing the DCT (Discrete Cosine Transform), in units of blocks consisting of 8×8 pixels, with respect to a prediction residual signal. A variety of standard image coding systems that start with MPEG1 perform data compression of image signals by combining MC and the DCT.

FIG. 20 is a block diagram showing the constitution of a conventional image coding apparatus based on an MPEG1 image coding system. An input image signal 1 which is inputted to the image coding apparatus shown in FIG. 20 is a temporal array of frame images and hereinafter denotes the signal of each individual frame image. Further, an example of a frame image that is to be encoded is shown in FIG. 21. The current frame 601 is divided into fixed square/rectangular regions of 16 pixels×16 lines (called macroblocks), and the processing that follows is performed in these units.

The macroblock data of the current frame (current macroblocks) produced by the input image signal 1 are first outputted to a motion detection section 2 where detection of motion vectors 5 is carried out. A motion vector 5 is detected by referencing a predetermined search region of a previously encoded frame image 4 (called a local decoding image 4 hereinafter) stored in a frame memory 3, locating a pattern similar to the current macroblock (called a prediction image 6 hereinafter), and determining the amount of spatial displacement between this pattern and the current macroblock.

Here, the local decoding image 4 is not limited to only the previous frame. Rather, a future frame can also be used as a result of being encoded in advance and stored in the frame memory 3. Although the use of a future frame entails switching of the coding order and in turn an increase in the processing delay, there is the merit that variations in the image content produced between previous and future frames are easily predicted, thus making it possible to reduce temporal redundancy still further.

Generally, in MPEG1, it is possible to selectively use three coding types, which are called bidirectional prediction (B frame prediction), forward prediction (P frame prediction) that uses previous frames alone, and I frame prediction which does not perform interframe prediction, instead performing coding only within the frame. FIG. 21 is limited to P frame prediction alone, with the previous frame 602 serving as the local decoding image 4.

The motion vector 5 shown in FIG. 20 is rendered by a two-dimensional parallel displacement amount. Block matching as represented in FIGS. 22A to 22D is generally used as the method of detecting the motion vector 5. A motion search range 603 centered on the spatial phase of the current macroblock is established, then, from the image data 604 within the motion search range 603 of the previous frame 602, the block for which the sum of the squares of the differences or the sum of the absolute values of the differences is minimum is determined as the motion predictive data, and the displacement of the motion predictive data in relation to the current macroblock is determined as the motion vector 5.
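The following is a minimal sketch of this kind of block matching, assuming 16×16 macroblocks, a square ±7-pixel search range, and the sum of absolute differences as the matching cost; the function name and parameters are hypothetical and boundary handling is simplified.

```python
import numpy as np

def block_matching(cur, ref, bx, by, bsize=16, search=7):
    """Full-search block matching: return the motion vector (dy, dx) that
    minimizes the sum of absolute differences (SAD) over a +/-search window."""
    cur_blk = cur[by:by + bsize, bx:bx + bsize].astype(np.int32)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            # Skip candidate blocks that fall outside the reference frame.
            if y < 0 or x < 0 or y + bsize > ref.shape[0] or x + bsize > ref.shape[1]:
                continue
            cand = ref[y:y + bsize, x:x + bsize].astype(np.int32)
            sad = int(np.abs(cur_blk - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad
```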

The motion predictive data for all the macroblocks in the current frame is determined, and this data, which is rendered as a frame image, is equivalent to the motion prediction frame 605 in FIG. 21. The difference 606 between the motion prediction frame 605 shown in FIG. 21, obtained by way of the above MC processing, and the current frame 601 is obtained (by the subtraction section 21 shown in FIG. 20), and this residual signal (called the prediction residual signal 8 hereinafter) undergoes DCT coding. Specifically, the processing to extract the motion predictive data for every macroblock (the prediction image 6 hereinafter) is performed by a motion compensation section 7. The processing performed by the motion compensation section 7 involves using a motion vector 5 to extract the prediction image 6 from the local decoding image 4 stored in the frame memory 3.

The prediction residual signal 8 is converted into DCT coefficient data 10 (also called DCT coefficients 10 hereinafter) by a DCT section 9. As shown in FIG. 23, the DCT converts the spatial pixel vectors denoted by 610 into a set of normalized orthogonal bases that render the fixed frequency components denoted by 611. 8×8 pixel blocks ('DCT blocks' below) are normally adopted for the spatial pixel vectors. Because the DCT is discrete transform processing, the DCT actually performs conversion for each of the horizontal and vertical 8-dimensional row and column vectors of the DCT block.

The DCT uses the correlation between pixels present in a spatial region to localize the power concentration in the DCT block. The higher the power concentration, the better the conversion efficiency, and therefore the performance of the DCT with respect to a natural image signal is not inferior when compared with the KL transform, which is the optimum transform. Particularly in the case of a natural image, the power is concentrated in the lower-frequency regions, with the DC component as the main part, and there is barely any power in the higher-frequency regions. Therefore, as shown in FIG. 24, by scanning from the lower regions to the higher regions as indicated by the arrows in the DCT block, so that the quantized coefficients denoted by 612 are rearranged into the sequence denoted by 613 containing long zero runs, the overall coding efficiency, which also includes the results of entropy coding, is improved.

The quantization of the DCT coefficients 10 is performed by a quantization section 11, and the quantized coefficients 12 obtained thereby are scanned, run-length encoded, and multiplexed in a compressed stream 14 by a variable length coding section 13 before being transmitted. Further, the motion vectors 5 detected by the motion detection section 2 are also multiplexed in the compressed stream 14 and transmitted, one macroblock at a time, because these vectors are required in order to allow the image decoding apparatus described subsequently to generate a prediction image that is the same as that of the image coding apparatus.
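As an illustration of this scan and run-length step, the following sketch quantizes an 8×8 DCT block with a single uniform step size (a simplification; MPEG-1 actually uses per-coefficient quantization matrices) and emits (zero-run, level) pairs in a conventional zigzag order; all names are hypothetical.

```python
import numpy as np

# A common zigzag order for an 8x8 block: visit coefficients from the DC term
# toward the high-frequency corner so that trailing zeros form one long run.
ZIGZAG = sorted(((r, c) for r in range(8) for c in range(8)),
                key=lambda rc: (rc[0] + rc[1],
                                rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def quantize(dct_block, qstep):
    """Uniform quantization of DCT coefficients (single step size for brevity)."""
    return np.round(np.asarray(dct_block, dtype=np.float64) / qstep).astype(int)

def run_length_encode(qblock):
    """Scan the quantized block in zigzag order and emit (zero_run, level) pairs."""
    pairs, run = [], 0
    for r, c in ZIGZAG:
        level = int(qblock[r, c])
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs  # trailing zeros are implied by an end-of-block code
```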

In addition, the quantized coefficients 12 are decoded locally via a reverse quantization section 15 and a reverse DCT section 16, and the decoded results are added to the prediction image 6 by an addition section 22, whereby a decoding image 17 which is the same as that of the image decoding apparatus is generated. The decoding image 17 is used in the prediction for the next frame and is therefore stored in the frame memory 3.

A description is provided next for the constitution of a conventional image decoding apparatus that is based on an MPEG1 image decoding system as shown in FIG. 25. After receiving the compressed stream 14, the image decoding apparatus detects a sync word indicating the start of each frame by means of a variable length decoding section 18, and subsequently decodes motion vectors 5 and quantized DCT coefficients 12 in macroblock units. The motion vectors 5 are outputted to a motion compensation section 7 d, and the motion compensation section 7 d extracts, as a prediction image 6, the image parts which have moved to an extent equivalent to the motion vectors 5, from a frame memory 19 (used in the same way as the frame memory 3), this operation being similar to the operation of the above-mentioned image coding apparatus. The quantized DCT coefficients 12 are decoded via a reverse quantization section 15 d and a reverse DCT section 16 d, and then added by the addition section 23 to the prediction image 6 to form the final decoding image 17. The decoding image 17 is outputted using predetermined display timing to a display device (not shown) where the image is played back.

DISCLOSURE OF THE INVENTION

However, in a conventional apparatus, MC performs movement amount detection based on the premise that all of the pixels in the blocks (referred to as MC blocks hereinafter, and as macroblocks in the MPEG1 example above) which are the units of MC possess the same motion. Consequently, the possibility exists that, with a prediction image that is constituted by the spatial disposition of MC blocks, a signal waveform will arise in which discontinuity is perceived at the boundaries of the MC blocks. This discontinuous waveform can be compensated by supplementing the residual component in cases where an adequate encoding data amount is allocated to the residual signal. However, when coding is carried out with a high compression ratio, a satisfactory rendition of the residual signal is not possible and the discontinuous boundaries are sometimes apparent and perceived as distortion.

Further, it has been identified that, because the DCT is also an orthogonal transform closed within fixed blocks, in cases where the transform basis coefficients are reduced as a result of coarse quantization, the signal waveform which naturally connects between blocks cannot be reconstituted and unnatural distortion is generated between blocks (block distortion).

As a means of solving the former problem of MC block boundary discontinuity, overlapped motion compensation (called OBMC hereinafter) has been proposed. As illustrated by FIGS. 26A and 26B, OBMC is a technique according to which the predictive data specified by an MC block's own motion vector is combined, with weighting, with the predictive data specified by the motion vectors of neighboring MC blocks, whereby the final predictive data is obtained.

In FIG. 26A, frame F(t) uses frame F(t−1) as a reference image and extracts predictive data in units of MC blocks (A to E, for example) from within the reference image. Normal MC employs this predictive data as is, but OBMC also extracts predictive data that corresponds to the position of block C by using the motion vectors MV(A), MV(B), MV(D), and MV(E) of the neighboring blocks A, B, D, and E shown in FIG. 26B, in determining the prediction image Pc of block C. In this extraction, P{C, MV(A)} signifies processing to extract predictive data for the position of C by using MV(A). The extracted predictive data are combined with weights W1 to W5 as per the following formula:

Pc = W1×P{C, MV(C)} + W2×P{C, MV(A)} + W3×P{C, MV(B)} + W4×P{C, MV(D)} + W5×P{C, MV(E)}

Here, the weight is normally set so that the influence of the original predictive data of block C becomes gradually smaller when moving from the center of block C toward the block boundaries. Such processing affords the benefit that, because the prediction image is determined such that the movement amounts of neighboring regions overlap block C's own motion, the continuity of the waveform is preserved between the inner and outer pixels of the MC blocks and thus the boundaries thereof do not readily stand out.
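The following is a minimal sketch of this weighted combination for one block C, with A, B, D, and E taken as the blocks above, to the left, to the right, and below respectively; the weight maps W1 to W5 are illustrative (they merely decay toward the corresponding boundary and sum to 1), since the text does not prescribe specific values, and boundary clipping of the reference frame is omitted.

```python
import numpy as np

def extract_pred(ref, top, left, mv, bsize=16):
    """P{C, MV}: extract the prediction for the block at (top, left) using motion vector mv."""
    dy, dx = mv
    return ref[top + dy:top + dy + bsize, left + dx:left + dx + bsize].astype(np.float64)

def obmc_predict(ref, top, left, mv_c, mv_a, mv_b, mv_d, mv_e, bsize=16):
    """Overlapped MC for block C: weighted sum of predictions fetched with the block's
    own vector and the vectors of its four neighbours (A: above, B: left, D: right, E: below)."""
    y = (np.arange(bsize).reshape(-1, 1) + 0.5) / bsize   # vertical position, 0..1
    x = (np.arange(bsize).reshape(1, -1) + 0.5) / bsize   # horizontal position, 0..1
    w_a = 0.5 * np.maximum(0.0, 0.5 - y)   # grows toward the top boundary
    w_e = 0.5 * np.maximum(0.0, y - 0.5)   # grows toward the bottom boundary
    w_b = 0.5 * np.maximum(0.0, 0.5 - x)   # grows toward the left boundary
    w_d = 0.5 * np.maximum(0.0, x - 0.5)   # grows toward the right boundary
    w_c = 1.0 - (w_a + w_b + w_d + w_e)    # largest at the block centre
    return (w_c * extract_pred(ref, top, left, mv_c, bsize)
            + w_a * extract_pred(ref, top, left, mv_a, bsize)
            + w_b * extract_pred(ref, top, left, mv_b, bsize)
            + w_d * extract_pred(ref, top, left, mv_d, bsize)
            + w_e * extract_pred(ref, top, left, mv_e, bsize))
```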

However, with OBMC, there is the problem that, in addition to the extraction of predictive data for a block's own motion vectors, processing is also executed for all the MC blocks which includes processing to extract predictive data by means of the motion vectors of neighboring MC blocks, and processing that involves the weighted addition of such data, meaning that the computational load is high.

Furthermore, in the movement amount detection involved in image coding, because detection is performed based on the criterion that the power of the prediction residual should be minimized, rather than on the movement amount that matches the natural movement of the subject, the problems exist that motion which is not based on true movement is sometimes detected in regions containing a lot of noise or in other locations, and that, in such locations, MC blocks are smoothed beyond what is necessary through the combined influence of neighboring movement amounts in OBMC, so that double-image blurring and similar artifacts are generated.

On the other hand, as a means of solving the latter problem of DCT block distortion, a loop filter has been proposed. The loop filter acts as a smoothing filter for the boundaries of the DCT blocks of a decoding image that is obtained by adding a prediction residual signal, which has undergone encoding and local decoding, to a prediction image. This is a technique that prevents the distortion caused by DCT quantization from being carried into MC, by removing block distortion from the reference image which is used for subsequent frames. However, so long as MC is limited to being performed in block units, the discontinuity between MC blocks will not necessarily be avoided. Further, there is the problem that, in cases where residual coding which is not dependent on block structure, such as subband coding or a block-spanning basis transform or the like, is performed, coding efficiency disadvantages caused by the existence of a discontinuous waveform at block boundaries cannot be avoided.

The present invention was conceived in view of these problems, an object thereof being to provide an image coding apparatus, image decoding apparatus, image coding method, and image decoding method that make it possible to use relatively simple computation to perform processing, with respect to a prediction frame image generated by block-based motion-compensated interframe prediction (MC), to adaptively smooth a discontinuous waveform generated between MC blocks of the prediction frame image, whereby the efficiency of low bit rate coding that employs interframe MC can be improved.

In order to resolve the above problems, the image coding apparatus according to the present invention is characterized by comprising: motion compensation predicting means for generating a motion-compensated prediction image by detecting movement amounts in predetermined partial image region units of an input image; smoothing means for performing smoothing of pixels located at the boundaries of adjoining partial image regions on the basis of predetermined evaluation criteria, with respect to the prediction image obtained by the motion compensation predicting means; and prediction residual coding means for coding a prediction residual signal obtained from the difference between the input image and the smoothed prediction image.

Further, as the image decoding apparatus which corresponds to this image coding apparatus, the image decoding apparatus according to the present invention is characterized by comprising: motion compensation predicting means for generating a motion-compensated prediction image by detecting movement amounts in predetermined partial image region units; smoothing means for performing smoothing of pixels located at the boundaries of adjoining partial image regions on the basis of predetermined evaluation criteria, with respect to the prediction image obtained by the motion compensation predicting means; prediction residual decoding means for decoding a prediction residual signal from the encoding side; and adding means for obtaining a decoded image by adding together a decoded prediction residual signal obtained by the prediction residual decoding means, and the smoothed prediction image.

In addition, in order to resolve the above problems, the image coding method according to the present invention is characterized by comprising: a motion compensation predicting step of generating a motion-compensated prediction image by detecting movement amounts in predetermined partial image region units of an input image; a smoothing step of performing smoothing of pixels located at the boundaries of adjoining partial image regions on the basis of predetermined evaluation criteria, with respect to the prediction image obtained by the motion compensation predicting step; and a prediction residual coding step of coding the prediction residual signal obtained from the difference between the input image and the smoothed prediction image.

Further, as the image decoding method which corresponds to this image coding method, the image decoding method according to the present invention is characterized by comprising: a motion compensation predicting step of generating a motion-compensated prediction image by detecting movement amounts in predetermined partial image region units; a smoothing step of performing smoothing of pixels located at the boundaries of adjoining partial image regions on the basis of predetermined evaluation criteria, with respect to the prediction image obtained by the motion compensation predicting step; a prediction residual decoding step of decoding a prediction residual signal from the encoding side; and an adding step of obtaining a decoded image by adding together a decoded prediction residual signal obtained by the prediction residual decoding step, and the smoothed prediction image.

According to this constitution, smoothing is performed for pixels located at the boundaries of adjoining partial image regions on the basis of predetermined evaluation criteria, with respect to the prediction image, and it is therefore possible to restrict correction to smoothing processing that corrects discontinuity between partial image regions. It is therefore possible to improve the coding efficiency by suppressing discontinuous waveforms generated in the prediction residual. Accordingly, it is possible to use relatively straightforward computation to perform processing, with respect to a prediction frame image generated by block-unit motion-compensated interframe prediction (MC), to adaptively smooth a discontinuous waveform generated between MC blocks of the prediction frame image, whereby the efficiency of low bit rate coding that employs interframe MC can be improved.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing the constitution of the image coding apparatus according to a first embodiment of the present invention;

FIG. 2 is a block diagram showing the constitution of the image decoding apparatus according to the first embodiment;

FIGS. 3A to 3G show the form of MC (motion-compensated interframe prediction) blocks;

FIG. 4 serves to illustrate the problems posed with block unit MC;

FIG. 5 is a block diagram showing the constitution of a smoothing filter section of the first embodiment;

FIG. 6 is a flowchart serving to illustrate the computational processing operation of the block activity level calculation section of the smoothing filter section;

FIG. 7 shows an example of a unit for determining the block activity level;

FIG. 8 is a flowchart to illustrate the operation of the processing to correct futile filter processing in the smoothing filtering performed by the filter processing section of the smoothing filter section;

FIG. 9 shows the appearance of pixels for processing which are between blocks that are adjoined in a lateral direction, in smoothing filter processing;

FIGS. 10A and 10B show the appearance of pixels for processing which are between blocks that are adjoined in a lateral direction, in smoothing filter processing performed by another filter;

FIG. 11 shows the function established in the post-processing section of the smoothing filter section;

FIG. 12 is a block diagram showing the constitution of the image coding apparatus according to a second embodiment of the present invention;

FIG. 13 is a block diagram showing the constitution of the image decoding apparatus according to the second embodiment;

FIG. 14 serves to illustrate the definition, for block boundaries, of an activity level, in the smoothing filter section according to a third embodiment of the present invention;

FIGS. 15A and 15B serve to illustrate a case where a 5-tap filter, which has the pixel to be filtered at the center and which uses two pixels to the left and right thereof respectively, is employed in the smoothing filter section according to a fourth embodiment of the present invention;

FIG. 16 is a block diagram showing the constitution of the image coding apparatus according to a fifth embodiment of the present invention;

FIG. 17 is a block diagram showing the constitution of the image decoding apparatus according to the fifth embodiment;

FIG. 18 is a block diagram showing the constitution of the smoothing filter section according to the fifth embodiment;

FIG. 19 is a flowchart to illustrate the operation of the block activity level calculation section in the smoothing filter section of the fifth embodiment;

FIG. 20 is a block diagram showing the constitution of a conventional image coding apparatus based on an MPEG1 image coding system;

FIG. 21 is a conceptual view of motion-compensated frame prediction;

FIGS. 22A to 22D are conceptual views of motion compensation by means of block matching;

FIG. 23 is a conceptual view of the Discrete Cosine Transform;

FIG. 24 is an illustrative view of quantization and run length encoding;

FIG. 25 is a block diagram showing the constitution of a conventional image decoding apparatus based on an MPEG1 image decoding system; and

FIGS. 26A and 26B are illustrative views of OBMC (overlapped motion compensation).

BEST MODES FOR CARRYING OUT THE INVENTION

Embodiments of the present invention will be described in detail hereinbelow by referring to the drawings.

First Embodiment

FIG. 1 is a block diagram showing the constitution of the image coding apparatus according to the first embodiment of the present invention, and FIG. 2 is a block diagram showing the constitution of the image decoding apparatus. The image coding apparatus shown in FIG. 1 performs efficient image coding on account of being constituted to reduce redundancy in a temporal direction by means of MC, quantify spatial discontinuity between individual MC blocks with respect to the prediction image obtained by means of the MC, and adaptively perform smoothing filter processing in accordance with the conditions.

The MC procedure of this image coding apparatus is substantially the same as the method described in the conventional example. An outline of this procedure is provided in FIG. 21, and an outline of the block matching processing employed in the motion vector detection is as shown in FIGS. 22A to 22D. However, the MC blocks can be defined as units obtained by uniformly dividing macroblocks into a variety of rectangular regions as shown by MC modes 1 to 7 in FIGS. 3A to 3G, and identification data indicating which MC block shape is used is transmitted as coding mode data.

For example, in MC mode 1 shown in FIG. 3A, the macroblock itself is the MC block, and one motion vector is established for the macroblock. On the other hand, in MC mode 2 shown in FIG. 3B, the regions produced by dividing the macroblock into lateral halves are the MC blocks, meaning that two motion vectors per macroblock are established. Similarly, in MC mode 7 shown in FIG. 3G, 16 motion vectors per macroblock are established.

A procedure that involves performing an orthogonal transform with respect to a residual signal obtained from the difference between an input image and a prediction image which has undergone smoothing processing, and then quantizing and entropy coding the corresponding coefficients, is also as described with reference to FIG. 24 in the conventional example.

The operation of the image coding apparatus and image decoding apparatus shown in FIGS. 1 and 2 respectively will be described below with reference to the drawings and will be centered on the smoothing filter processing which is a particular feature of the present embodiment.

The operation of the image coding apparatus will be described first. The input image signal 101 is a temporal array of frame images and hereinafter denotes the signal of each individual frame image. The frame image that is to be encoded is the current frame 601 shown in FIG. 21.

The current frame is encoded by means of the following procedure. The input image signal 101 is inputted to a motion detection section 102 one macroblock at a time, and motion vectors 105 are detected in the motion detection section 102. Of the macroblock forms shown in FIGS. 3A to 3G, the form which affords the best coding efficiency is selected for the MC block which is the motion-vector assigned unit. The motion compensation section 107 uses the motion vectors 105 to reference a reference image 104 (an encoded and locally decoded frame image) which is stored in the frame memory 103 and then extracts a prediction image 106 a for the macroblocks.

Although the motion detection section 102 and the motion compensation section 107 perform processing for every one of the macroblocks, the signal for the difference with respect to the input image signal 101 (the prediction residual signal 108) is obtained with the frame as the unit. That is, the motion vectors 105 of individual macroblocks are maintained over the entire frame, whereby the prediction image 106 a is constituted as a frame-unit image.

Next, smoothing processing between MC blocks of the prediction image 106 a is performed in the smoothing filter section 124. The details of this processing will be described subsequently. The smoothed prediction image 106 b is subtracted from the input image signal 101 by the subtraction section 131, and, as a result, the prediction residual signal 108 is obtained. The prediction residual signal 108 is converted into orthogonal transform coefficient data 110 by the orthogonal transform section 109. A DCT is used, for example, as the orthogonal transform. The orthogonal transform coefficient data 110 passes through a quantization section 111, and is scanned and run-length encoded by the variable length coding section 113, before being multiplexed and transmitted in a compressed stream 114 by that section.

Thereupon, coding mode data 123 that indicates whether intraframe coding or interframe coding has been performed, which is determined one macroblock at a time, is also multiplexed. In an inter mode case, motion vectors 105 are multiplexed and transmitted in the compressed stream 114 one macroblock at a time. Further, the quantized coefficients 112 are locally decoded via a reverse quantization section 115 and a reverse orthogonal transform section 116, and the decoded result is added to the prediction image 106 b by an addition section 132 to thereby generate a decoding image 117 which is the same as that on the image decoding apparatus side. The decoding image 117 is stored in the frame memory 103 to be used as a reference image 104 for the prediction of the next frame.

Next, the operation of the image decoding apparatus will be described with reference to FIG. 2. After the compressed stream 114 has been received by the image decoding apparatus, the variable length decoding section 118 detects a sync word that represents the start of each frame, whereupon coding mode data 123, motion vectors 105, and quantized orthogonal transform coefficients 112 are decoded in macroblock units. The motion vectors 105 are outputted to the motion compensation section 107, and the motion compensation section 107 extracts, as the prediction image 106 a, the image parts which have moved to an extent equivalent to the motion vectors 105 from a frame memory 122 (used in the same way as the frame memory 103), this operation being similar to the operation of the image coding apparatus.

The prediction image 106 a passes through the smoothing filter section 124 and is then outputted as the smoothed prediction image 106 b. The quantized orthogonal transform coefficients 112 are decoded via a reverse quantization section 120 and a reverse orthogonal transform section 121, and then added by an addition section 133 to the prediction image 106 b to form the final decoded image 117. The decoded image 117 is stored in the frame memory 122 and outputted to a display device (not shown) with predetermined display timing, whereby the image is played back.

Next, the operation of the smoothing filter section 124 will be described. First, the grounds for the need for the smoothing filter will be described with reference to FIG. 4. FIG. 4 shows an input image 141, a prediction image 142 in which no discontinuity between MC blocks is generated as well as a prediction residual image 143 which uses the prediction image 142, and a prediction image 144 in which discontinuity between MC blocks is generated as well as a prediction residual image 145 which uses the prediction image 144. Motion vectors are detected with MC blocks as units by means of the motion detection algorithm of a general image coding system, such as a block matching algorithm or the like.

That is, all the pixels contained in the MC blocks possess the same movement amount. Generally, in block unit MC, the motion vectors which afford the greatest reduction in the prediction residual for the MC blocks are detected, meaning that no consideration is given to spatial continuity with adjoining MC blocks. For this reason, as shown in FIG. 4, a discontinuous waveform 146 is sometimes produced between MC blocks. This discontinuous waveform 146 remains in the residual signal and is a signal subjected to encoding. Note that, to be precise, the orthogonal transform itself is closed within the above MC blocks, and the discontinuity therefore has no direct influence on the coding of this frame.

However, in cases where, in the prediction residual coding of this frame, such a specific waveform cannot be adequately encoded, the waveform component remains in the local decoding image and appears within the MC blocks in the prediction image of subsequent frames. This sometimes influences the coding efficiency of the prediction residual signal. In a natural image, the boundaries of the original MC blocks should be smoothly linked and, based on this assumption, the processing performed by the smoothing filter section 124 has the object of obtaining a prediction image which is close to a natural image by smoothing any discontinuous waveforms present between MC blocks.

The constitution of the smoothing filter section 124 is shown in FIG. 5 and will now be described. First, a block activity level calculation section 125 determines a block activity level S(X) in units of fixed blocks X in the prediction image 106 a. The flow of the processing by the block activity level calculation section 125 is shown in FIG. 6. The determination of S(X) is based on the relationship with the neighboring blocks, as shown in FIG. 7. Here, the blocks A to E are units for determining the block activity level and are not necessarily the same as the MC blocks. For example, FIG. 7 shows that block A is one part of a larger MC block 147. In other words, the block activity level is defined for blocks of a fixed size that bear no relation to the sizes of the MC blocks shown in FIGS. 3A to 3G. First, S(X) is set to a predetermined initial value S0 (zero, for example) over the whole area of the frame. Then, when the coding mode data 123 of the macroblock including block C indicates the intra mode, the block activity level S(X) is determined in accordance with the following Rule 1 (step ST1).

(Rule 1)

- (1) current S(A) is updated to max {S(A), S0+1}
- (2) current S(B) is updated to max {S(B), S0+1}
- (3) current S(C) is updated to max {S(C), S0+2}
- (4) current S(D) is updated to max {S(D), S0+1}
- (5) current S(E) is updated to max {S(E), S0+1}

Therefore, the block activity level in the vicinity of a block to be intra coded is set high. The resolution of the prediction image in intra coding is generally lower than that of a prediction image produced by inter coding and hence the boundaries of blocks among macroblocks of intra mode stand out easily. The provision of step ST1 is equivalent to raising the priority of smoothing processing in such regions.

A description will be provided next for the rule for setting S(X) when the coding mode data 123 of the macroblock including block C indicates inter coding. First, it is judged whether or not the current prediction image was generated using bidirectional prediction (the B frame prediction mentioned in the conventional example) (step ST2).

In cases where bidirectional prediction can be used, the prediction direction can be changed for each of the macroblocks. When the prediction direction differs between blocks, spatial continuity at the boundary of the two blocks cannot be assumed. That is, a judgment is made of whether or not the prediction direction of blocks A, B, D, and E, which adjoin block C, is the same as that of block C, and processing is then switched accordingly (step ST3).

When only unidirectional prediction is used, or when the frame permits bidirectional prediction and the prediction direction of the adjoining block is the same as that of block C, the block activity level is updated in accordance with Rule 2 below (step ST4).

(Rule 2)

- (1) If the macroblock including block A is of an inter mode, current S(A) is updated to max {S(A), K} and current S(C) is updated to max {S(C), K}, where
  - K=2 (when mvd(A,C)≧3)
  - K=1 (when 0<mvd(A,C)<3)
  - K=0 (when mvd(A,C)=0)

- (2) If the macroblock including block B is of an inter mode, current S(B) is updated to max {S(B), K} and current S(C) is updated to max {S(C), K}, where
  - K=2 (when mvd(B,C)≧3)
  - K=1 (when 0<mvd(B,C)<3)
  - K=0 (when mvd(B,C)=0)

- (3) If the macroblock including block D is of an inter mode, current S(D) is updated to max {S(D), K} and current S(C) is updated to max {S(C), K}, where
  - K=2 (when mvd(D,C)≧3)
  - K=1 (when 0<mvd(D,C)<3)
  - K=0 (when mvd(D,C)=0)

- (4) If the macroblock including block E is of an inter mode, current S(E) is updated to max {S(E), K} and current S(C) is updated to max {S(C), K}, where
  - K=2 (when mvd(E,C)≧3)
  - K=1 (when 0<mvd(E,C)<3)
  - K=0 (when mvd(E,C)=0)

- (5) If blocks A, B, D, and E are intra coded, the block activity levels thereof are not changed.

In the above rule, mvd(X,Y) indicates the larger of the differences between the corresponding components of the motion vectors of adjoining blocks X and Y. Further, max(a,b) indicates the larger of the values a and b. By updating the block activity level as above, a high block activity level can be provided between blocks exhibiting a marked motion vector difference.

When mvd(X,Y)=0 (when there is no motion vector difference between blocks X and Y), this represents a case where the block boundary retains complete spatial continuity and there is thus no need for smoothing here. The block activity level is therefore set to the minimum value.

On the other hand, in a frame permitting the use of bidirectional prediction, when the prediction direction of blocks A, B, D, and E differs in relation to block C, or when a mode is used that combines prediction images by adding and averaging the prediction values in the forward and backward directions, the spatial continuity of the prediction image is broken irrespective of the motion vector difference, and hence the current S(X) is updated to max {S(X), 1} (step ST5), where X represents blocks A to E. The above processing is performed until completed for all the fixed blocks X in the frame (step ST6), and the setting of the block activity level S(X) 126 is thus completed; a sketch of this activity-level update procedure is given below.
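The following is a minimal sketch of steps ST1 to ST6 (Rules 1 and 2 plus the differing-prediction-direction case), assuming each fixed block carries its coding mode, motion vector, prediction direction, and a map of its four neighbours; all data structures and field names here are hypothetical, and S is a dictionary pre-initialized to S0 for every block.

```python
S0 = 0  # initial block activity level

def mvd(mv_x, mv_y):
    """mvd(X, Y): the larger of the component-wise differences between the
    motion vectors of adjoining blocks X and Y."""
    return max(abs(mv_x[0] - mv_y[0]), abs(mv_x[1] - mv_y[1]))

def k_from_mvd(d):
    """Map a motion vector difference to an activity contribution K (Rule 2)."""
    if d >= 3:
        return 2
    return 1 if d > 0 else 0

def update_activity(S, blocks):
    """One pass over the fixed blocks.  `blocks` maps a block id to a dict with
    its neighbours ('A' above, 'B' left, 'D' right, 'E' below), coding mode,
    motion vector 'mv', and prediction direction 'pred_dir'."""
    for c, info in blocks.items():
        if info['mode'] == 'intra':
            # Rule 1 (step ST1): raise the activity around an intra-coded block.
            S[c] = max(S[c], S0 + 2)
            for n in info['neighbours'].values():
                S[n] = max(S[n], S0 + 1)
            continue
        for n in info['neighbours'].values():
            if blocks[n]['mode'] == 'intra':
                continue                      # Rule 2 (5): leave intra neighbours unchanged
            if info['pred_dir'] != blocks[n]['pred_dir']:
                # Differing prediction directions break spatial continuity (step ST5).
                S[c] = max(S[c], 1)
                S[n] = max(S[n], 1)
            else:
                k = k_from_mvd(mvd(info['mv'], blocks[n]['mv']))  # Rule 2 (1)-(4), step ST4
                S[c] = max(S[c], k)
                S[n] = max(S[n], k)
    return S
```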

By using the block activity level S(X) 126 set by the block activity level calculation section 125, smoothing processing between MC blocks is performed for the prediction image 106 a in the filter processing section 127. In this process, futile filter processing is corrected in the post-processing section 129 so that the results 128 of performing smoothing filtering once do not produce excessive smoothing. The flow of this processing is shown in the flowchart of FIG. 8, and the state of the pixels to be processed between laterally adjoining blocks is shown in FIG. 9; these will now be described.

That is, in FIG. 9, the pixels r₁ to r₃ are contained in block n and the pixels l₁ to l₃ are contained in block n−1, which adjoins block n on the left. In the description that follows, the assumption is made that block n and block n−1 possess mutually different vectors MV(n) and MV(n−1) respectively and that an MC block boundary BD lies between r₁ and l₁. The definition is also the same for pixels to be processed between vertically adjoining blocks.

First, it is judged whether or not the magnitude of the block boundary difference value d=|r₁−l₁| (here, r₁ and l₁ represent the pixel values of the respective pixels r₁ and l₁) exceeds a threshold value α(S) established in accordance with the block activity level S (step ST7). In the filter processing below, the processing straddles the boundary between two blocks, and therefore the larger of the S(X) values of the two blocks processed is used as the block activity level S. For example, when filtering is performed on the boundary of block B and block C in FIG. 7, if S(B)>S(C), the value of S(B) is used as the block activity level S. When the difference value d is equal to or less than the threshold value α(S), filter processing is not performed in the pixel regions of r₁ to r₃ and l₁ to l₃. On the other hand, when the difference value d exceeds the threshold value α(S), filter processing is performed by switching the pixel region to be filtered in accordance with S (step ST8).

As a result, if S=0, there is no discontinuity at the block boundary and so the filter processing is skipped. If S=1, filter processing is performed for the two pixels r₁ and l₁ (step ST9). As shown in FIG. 9, for example, there are methods including one where the filter processing employs a low pass filter F constituted to use the three points r₁, l₁, and r₂ for pixel r₁, and the three points l₁, l₂, and r₁ for pixel l₁, but any given filter can be used. For example, a 5-tap filter, which has the pixel r₁ or l₁ at the center and which uses two pixels to the left and right (or above and below) thereof respectively, may be employed. As a further example of a filter, constitutions may also be considered such as one in which, as shown in FIG. 10A, when pixel l₁ is filtered, pixel r₁ and the prediction pixel value lr₁ for the position of pixel l₁, which is extracted by means of the vector MV(n) of pixel r₁, are employed, and, as shown in FIG. 10B, when pixel r₁ is filtered, pixel l₁ and the prediction pixel value rl₁ for the position of r₁, which is extracted by means of the vector MV(n−1) of pixel l₁, are employed. The prediction pixel values lr₁ and rl₁ are pixel values which are spatially linked from the start to the pixels r₁ and l₁ in the reference image region in the frame memory, which makes more natural smoothing of block boundaries possible.

When S=2, in addition to the pixels r₁ and l₁, the pixels r₂ and l₂ are also targeted for smoothing (steps ST10 and ST11). In cases where S=2, there are often steep and discontinuous boundaries owing to the high block activity level, and hence the object is to increase the continuity of the signal by increasing the extent of the smoothing.
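The following is a minimal sketch of steps ST7 to ST11 for one boundary lying between l₁ and r₁, using the 3-point low pass filter described above; the α(S) values, the (1, 2, 1)/4 filter weights, and the support pixels used for r₂ and l₂ in the S=2 case are illustrative assumptions rather than values given in the text.

```python
def alpha(S):
    """Threshold alpha(S) on the boundary step |r1 - l1| (illustrative values)."""
    return {0: 255, 1: 4, 2: 2}[S]

def smooth_boundary(l3, l2, l1, r1, r2, r3, S):
    """Smooth the integer pixel values on either side of an MC block boundary
    that lies between l1 and r1 (see FIG. 9); returns the six pixel values."""
    d = abs(r1 - l1)
    if S == 0 or d <= alpha(S):
        return l3, l2, l1, r1, r2, r3        # no discontinuity worth filtering (ST7/ST8)
    # S >= 1 (step ST9): filter the boundary pixels r1 and l1 with a (1,2,1)/4 kernel.
    new_r1 = (l1 + 2 * r1 + r2 + 2) // 4     # uses the three points r1, l1, r2
    new_l1 = (l2 + 2 * l1 + r1 + 2) // 4     # uses the three points l1, l2, r1
    new_r2, new_l2 = r2, l2
    if S == 2:
        # Steps ST10/ST11: extend smoothing one pixel further into each block.
        new_r2 = (new_r1 + 2 * r2 + r3 + 2) // 4
        new_l2 = (l3 + 2 * l2 + new_l1 + 2) // 4
    return l3, new_l2, new_l1, new_r1, new_r2, r3
```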

The processing described above (steps ST7 to ST11) is carried out in the filter processing section 127. The filtered pixel values 128 produced by the filter processing are then corrected by the post-processing section 129 so as to be beneficial to the coding efficiency. The processing of the post-processing section 129 is equivalent to steps ST12 and ST13 in FIG. 8. The post-processing section 129 controls the differential value Δ between the pixel value before filtering and the pixel value after filtering by means of a threshold value Th(S) established in accordance with the block activity level S.

Specifically, the functions shown in FIG. 11 (horizontal axis: Δ, vertical axis: Δ correction value) are established and the correction value is determined. Here, the threshold value Th(S) is the maximum permissible differential value, and in cases where a Δ that is equal to or more than this value is produced, correction is applied in the difference-diminishing direction in accordance with the size of this value. In cases where Δ is equal to or more than the threshold value Th(S), the assumption is made that the difference obtained by filtering is not attributable to MC block discontinuity but instead is the result of filtering with respect to an edge component that was originally present in the image.
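The exact shape of the FIG. 11 correction function is not spelled out in the text, so the sketch below stands in with a simple linear roll-off: the filtered value is accepted while |Δ| is within Th(S), and the correction shrinks back toward the original pixel as |Δ| grows beyond it. The Th(S) values are illustrative assumptions.

```python
def th(S):
    """Maximum permissible change Th(S) per pixel (illustrative values)."""
    return {0: 0, 1: 2, 2: 4}[S]

def post_process(original, filtered, S):
    """Limit the change produced by the smoothing filter (steps ST12/ST13).
    Changes larger than Th(S) are assumed to come from a real image edge,
    so the correction is pulled back toward the original pixel value."""
    delta = filtered - original
    limit = th(S)
    if abs(delta) <= limit:
        return filtered                       # accept the filtered value as-is
    # Beyond the threshold, shrink the correction as |delta| grows further
    # (a linear roll-off standing in for the function of FIG. 11).
    excess = abs(delta) - limit
    corrected = max(0, limit - excess)
    return original + corrected * (1 if delta > 0 else -1)
```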

Therefore, with the image coding apparatus and image decoding apparatus according to the first embodiment, the corrective measures performed by the smoothing filter section 124 described above make it possible to restrict correction to filtering that corrects MC block discontinuity. The prediction image 106 b is outputted via the above processing, and it is therefore possible to improve the coding efficiency by suppressing discontinuous waveforms generated in the prediction residual.

When setting the block activity level in FIG. 7, in inter mode, the block activity level is determined so as to correspond to the range of values of mvd(X,Y), but the method used to determine this range is optional. In particular, the block activity level in an inter mode case may be determined using only the criterion of whether mvd(X,Y) is zero or not zero. Further, since it can be said that, the smaller the motion-vector assigned units in the variety of MC block forms shown in FIGS. 3A to 3G become and the larger the number of motion vectors per macroblock becomes, the more intense the motion in the macroblock and in the vicinity thereof becomes, the block activity level may also be set based on the criterion of which of the MC modes 1 to 7 shown in FIGS. 3A to 3G is selected.

Furthermore, this smoothing filter processing can also be constituted so that it can be turned ON/OFF in frame units. The processing itself of the smoothing filter section 124 is processing to change prediction image data selected optimally in MC block units, and therefore this processing can have an adverse effect as well as a good effect on the coding efficiency. Thus, image analysis in frame units is performed by the image coding apparatus. It is judged beforehand whether or not motion causing discontinuity between MC blocks is present, and the smoothing filter section 124 is turned ON when discontinuity is generated and turned OFF in the absence of discontinuity.

Examples of image analysis include the evaluation of a provisional residual between the input image signal 101 and the prediction image 106 a. The signal distribution of the residual is examined and, because residual coding does not require smoothing filter processing in frames that are not particularly disadvantaged, the filter is turned OFF for such frames, while the filter is turned ON for frames that are significantly disadvantaged. For example, consideration may be given to operation such that, in cases where the proportion of the residual signal amount at the MC boundaries in relation to the overall residual signal amount is equal to or more than a certain fixed threshold value, the filter is turned ON, and when this proportion is equal to or less than a threshold value, the filter is turned OFF. Alternatively, there are also methods in which the determination of whether to turn the filter ON or OFF is made after the frame unit coding efficiency has been compared for the cases where smoothing processing is and is not performed. The result of the ON/OFF determination is transmitted as a portion (bit data that indicates the presence or absence of smoothing) of the header information at the start of a frame in the compressed stream 114. By means of such a constitution, smoothing processing can be applied more adaptively to an irregular image signal.
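The following is a minimal sketch of the first analysis mentioned above, deciding the frame-level ON/OFF flag from the share of the provisional residual that falls on MC block boundaries; the 16-pixel boundary grid, the use of absolute values as the "signal amount", and the 0.4 threshold are illustrative assumptions.

```python
import numpy as np

def smoothing_filter_on(residual, bsize=16, threshold=0.4):
    """Decide per frame whether to enable the smoothing filter: if a large share
    of the provisional residual sits on MC block boundaries, smoothing is ON."""
    residual = np.abs(residual.astype(np.float64))
    total = residual.sum() + 1e-9
    mask = np.zeros(residual.shape, dtype=bool)
    mask[bsize - 1::bsize, :] = True   # rows just above each horizontal block boundary
    mask[bsize::bsize, :] = True       # rows just below each horizontal block boundary
    mask[:, bsize - 1::bsize] = True   # columns just left of each vertical block boundary
    mask[:, bsize::bsize] = True       # columns just right of each vertical block boundary
    boundary = residual[mask].sum()
    return (boundary / total) >= threshold
```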

Second Embodiment

FIG. 12 is a block diagram showing the constitution of the image coding apparatus according to the second embodiment of the present invention, and FIG. 13 is a block diagram showing the constitution of the image decoding apparatus thereof. The second embodiment relates to an apparatus constituted by introducing the smoothing filter of the present invention described above into an image coding and decoding apparatus according to a compression coding system that applies the technique known as Matching Pursuits. Image coding systems that use Matching Pursuits include that disclosed by R. Neff et al., "Very Low Bit-rate Video Coding Based on Matching Pursuits", IEEE Trans. on CSVT, vol. 7, pp. 158-171, February 1997. With Matching Pursuits, a prediction residual image signal f to be encoded can be rendered as per the following formula by using an over-complete basis set G prepared in advance that comprises n types of basis g_(k)∈G (1≦k≦n).

$$f = \left( \sum_{i=0}^{m-1} \langle s_i, g_{k_i} \rangle\, g_{k_i} \right) + r_m \qquad (1)$$

Here, m is the total number of basis search steps, i is the basis search step number, and r_(i) is the prediction residual image signal following completion of the basis search of the (i−1)th step, this signal being, without further processing, the prediction residual image signal for the basis search of the ith step, where r₀=f. Further, s_(i) and g_(ki) are the partial region and basis respectively, these being obtained by selecting, in the basis search of the ith step, the combination of s and g_(k) such that the inner product value thereof is maximized, from the optional partial regions s (partial regions in a frame) of r_(i) and the optional bases g_(k) contained in the basis set G. If the basis search is performed thus, the larger the number m of basis search steps, the more the energy of r_(m) diminishes. This means that the greater the number of bases used in the rendition of the prediction residual image signal f, the better the signal can be rendered.
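The following is a minimal sketch of this greedy basis search for formula (1), assuming the dictionary G is supplied as a list of small unit-norm 2-D patches and that the search is exhaustive over every position in the residual; practical Matching Pursuits coders restrict the search heavily, and all names here are hypothetical.

```python
import numpy as np

def matching_pursuit(residual, bases, m):
    """Greedy Matching Pursuits over a 2-D residual image (formula (1)).
    `bases` is a list of small 2-D arrays with unit L2 norm (the dictionary G).
    Returns the atom list [(basis index, (y, x), coefficient), ...] and r_m."""
    r = residual.astype(np.float64).copy()
    atoms = []
    for _ in range(m):
        best = None                          # (|coeff|, k, (y, x), coeff)
        for k, g in enumerate(bases):
            gh, gw = g.shape
            for y in range(r.shape[0] - gh + 1):
                for x in range(r.shape[1] - gw + 1):
                    coeff = float((r[y:y + gh, x:x + gw] * g).sum())  # <s_i, g_k>
                    if best is None or abs(coeff) > best[0]:
                        best = (abs(coeff), k, (y, x), coeff)
        _, k, (y, x), coeff = best
        g = bases[k]
        # r_{i+1} = r_i - <s_i, g_ki> g_ki : subtract the chosen atom.
        r[y:y + g.shape[0], x:x + g.shape[1]] -= coeff * g
        atoms.append((k, (y, x), coeff))
    return atoms, r
```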

In each of the basis search steps, the data that is encoded is:

- 1) the index expressing g_(ki) (g_(k) is shared and maintained on the encoding side and the decoding side, which makes it possible to specify a basis by transmitting only the index data),
- 2) the inner product value <s_(i), g_(ki)> (corresponding to the basis coefficient), and
- 3) the on-screen center position data p_(i)=(x_(i), y_(i)) of s_(i).

A set of these parameters is collectively known as an atom. By means of this image signal rendition and encoding method, as the number of encoded atoms is increased, that is, as the total number m of basis search steps increases, the encoded volume also increases, whereby distortion is reduced.

On the other hand, in the image coding performed by Matching Pursuits in the above paper, MC is carried out independently from Matching Pursuits, and atom extraction is performed with respect to the prediction residual signal. In this case, there is the possibility that atoms will be extracted at positions extending over MC block boundaries. So long as a system is adopted in which MC is dependent on the block structure, there is the disadvantage that a discontinuous waveform between MC blocks, as described in the first embodiment above, remains in the residual signal and thus a waveform which should not have to be encoded will be encoded.

Conventionally, overlapped MC that considers the motion vectors of neighboring MC blocks has been utilized as a measure to resolve the foregoing problem. However, overlapped MC references a larger number of prediction values and calculates the final prediction value by means of a weighted sum, and there are therefore the problems that the computational cost is high and that adaptive smoothing with respect to the pixel values in the MC blocks is not possible, which blurs the prediction image excessively. By performing adaptive smoothing filter processing at the MC block boundaries as described in the first embodiment, smoothing of the residual signal can be performed without blurring the prediction image excessively.

In the image coding apparatus shown in FIG. 12, the input image signal 201 is a temporal array of frame images and hereinafter denotes the signal of each individual frame image. The frame image to be encoded corresponds to the current frame 601 shown in FIG. 21. The current frame is encoded by means of the following procedure.

First of all, the current frame is outputted to a motion detection section 202, and detection of the motion vectors 205 is performed by means of a procedure that is exactly the same as that of the motion detection section 102 of the first embodiment above. However, the motion detection section 202 divides the intra coding into that for the DC component and that for the AC component. The result of encoding the DC component is used as part of the prediction image, and the AC component is encoded as part of the prediction residual. This constitutes processing to obtain the prediction image batchwise in frame units in order to use the Matching Pursuits.

Accordingly, when the intra mode is selected in the motion detection section 202, the corresponding macroblock of the prediction image is filled with an intra DC component which is encoded and locally decoded. The intra DC component undergoes prediction from neighboring image data as well as quantization in a DC coding section 225, and is outputted to a variable length coding section 213 as encoded data 226 and multiplexed in a compressed stream 214.

A motion compensation section 207 uses the DC component as above for intra mode macroblocks, and, for inter mode macroblocks, uses the motion vectors 205 to reference a local decoding image 204 in the frame memory 203, whereby a prediction image 206 a for the current frame is obtained. Although the motion detection section 202 and the motion compensation section 207 perform processing for each of the macroblocks, the differential signal with respect to the input image signal 201 (the prediction residual signal 208) is obtained by taking the frame as the unit. That is, the motion vectors 205 of individual macroblocks are maintained over the entire frame, whereby the prediction image 206 a is constituted as a frame-unit image.

Next, the smoothing filter section 224 performs smoothing processing between the MC blocks of the prediction image 206 a. The operation of the smoothing filter section 224 uses the coding mode data 223 and motion vectors 205 and is implemented by means of processing like that in the first embodiment. The smoothed prediction image 206 b is subtracted from the input image signal 201 by a subtraction section 241 to obtain a prediction residual signal 208.

Next, the atom extraction section 209 generates atom parameters 212 on the basis of the above-described Matching Pursuits algorithm, with respect to the prediction residual signal 208. A basis set g_(k) 211 is stored in a basis codebook 210. If, based on the properties of the Matching Pursuits algorithm, a basis which can render the partial signal waveform as accurately as possible can be found in an initial search step, the partial signal waveform can be rendered by fewer atoms, that is, with a small encoded volume. Atoms are extracted over the whole area of the frame. For the coding of the position data in the atom parameters, making use of the fact that the atom coding order does not influence the decoding image, sorting is performed such that the atoms are aligned in order using two-dimensional co-ordinates with the top left-hand corner of the frame as the starting point, and the coding order is constructed so that the atoms are counted in macroblock units. The macroblock units are therefore constituted such that the atom parameters 212 (the respective basis index, position data, and basis coefficient) are coded in proportion to the number of atoms contained in the macroblock units.
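The following is a minimal sketch of this ordering step, sorting the atoms in raster order from the top left-hand corner and grouping them per macroblock so that a per-macroblock atom count can precede that macroblock's atom parameters in the stream; the tuple layout follows the Matching Pursuits sketch above and the names are hypothetical.

```python
def order_atoms(atoms, frame_width, mb_size=16):
    """Sort atoms into raster order from the top-left corner of the frame and
    group them by macroblock index for macroblock-unit coding."""
    # atoms: list of (basis_index, (y, x), coefficient)
    atoms_sorted = sorted(atoms, key=lambda a: (a[1][0], a[1][1]))
    mbs_per_row = frame_width // mb_size
    per_macroblock = {}
    for k, (y, x), coeff in atoms_sorted:
        mb_index = (y // mb_size) * mbs_per_row + (x // mb_size)
        per_macroblock.setdefault(mb_index, []).append((k, (y, x), coeff))
    return per_macroblock
```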

An atom decoding section 215 decodes a local decoding residual signal 216 from the atom parameters 212 and then obtains a local decoding image 217 by adding the local decoding residual signal 216 to the smoothed prediction image 206 b by means of an addition section 242. The local decoding image 217 is stored in the frame memory 203 in order to be used in the MC for the next frame.

Next, the image decoding apparatus will be described by referring to FIG. 13. After the compressed stream 214 has been received by the image decoding apparatus, a variable length decoding section 229 detects a sync word that indicates the start of each frame, whereupon coding mode data 223, motion vectors 205, and atom parameters 212 are decoded in macroblock units. The motion vectors 205 are outputted to the motion compensation section 207, and its output 206 a is inputted to the smoothing filter section 224, whereby the prediction image 206 b is obtained. The atom parameters 212 are decoded by the atom decoding section 215. A basis is extracted by supplying a basis index to the basis codebook 210. The output 216 of the atom decoding section 215 is added to the prediction image 206 b by means of an addition section 243 to produce the decoding image 217. The decoding image 217 is used in the MC for subsequent frames and is therefore stored in a frame memory 230. The decoding image 217 is outputted to a display device (not shown) with predetermined display timing, whereby the image is played back.

Therefore, according to the image coding apparatus and the image decoding apparatus of the second embodiment, results similar to those of the first embodiment above can be obtained also for an image coding and decoding apparatus according to a compression coding system that applies the technique known as Matching Pursuits.

Third Embodiment

A third embodiment of the present invention will now be described. The third embodiment describes another smoothing filter section. This smoothing filter section is a modification of the smoothing filter sections 124 and 224 described in the first and second embodiments above, and because this filter simply substitutes for the smoothing filter sections 124 and 224, this filter can be applied to the image coding apparatus and image decoding apparatus shown in FIGS. 1 and 2 or FIGS. 12 and 13 respectively. The internal constitution is also the same as that in FIG. 5.

In the smoothing filter section according to the third embodiment, the block activity level calculation section 125 does not define the block activity level information with respect to the blocks but instead defines this information with respect to the block boundaries. Consequently, the filter can be controlled by uniquely allocating an activity level to each boundary, without having to select between the differing activity levels of two blocks as was done in the first and second embodiments.

The activity level is defined with respect to block boundaries and therefore, as shown in FIG. 14, activity levels S_(L)(C) and S_(U)(C) are defined for the two boundaries to the left of and above one block ('C' here) respectively. In the determination of S_(L)(C), the activity level is found based on the relationship with block B to the left, and when S_(U)(C) is determined, the activity level is found based on the relationship with block A above.

The boundaries with blocks D and E are determined as S_(L)(D) and S_(U)(E) respectively. As indicated in the first embodiment, the activity level is determined by the motion vector difference between the two blocks and by a difference in coding mode therebetween, and can therefore be determined using setting rules like those of the first embodiment.

Therefore, according to the smoothing filter section of the third embodiment, the filter can be controlled by uniquely allocating an activity level to each boundary, without the selection of an activity level that is needed, as described in the first and second embodiments, in circumstances where the activity level differs between blocks.

Furthermore, in the third embodiment, because the determination of the activity level depends only on the blocks to the left and above, an apparatus that generates the prediction image in macroblock units and encodes and decodes this image is also able to carry out encoding and decoding processing while performing smoothing of the MC blocks. Further, by introducing pipeline processing in macroblock units, an implementation that enables rapid and efficient processing is possible for the image coding apparatus and image decoding apparatus.

Fourth Embodiment

A fourth embodiment of the present invention will now be described. The fourth embodiment describes another smoothing filter section. This smoothing filter section is a modification of the smoothing filter sections 124 and 224 described in the above first and second embodiments respectively, and because this filter simply substitutes for the smoothing filter sections 124 and 224, it can be applied to the image coding apparatus and image decoding apparatus shown in FIGS. 1 and 2 or FIGS. 12 and 13 respectively. The internal constitution is also the same as that in FIG. 5.

The smoothing filter section of the fourth embodiment switches the filter characteristics in accordance with the activity level. FIGS. 15A and 15B illustrate a case where a 5-tap filter, which has the pixel r₁ to be filtered at its center and uses two pixels to the left and right thereof respectively, is applied. As shown in FIG. 15B, in a case where the activity level is high (S=2) and there is a desire to raise the extent of the smoothing still further, a filter that increases the influence of the nearby pixels in the filter window is applied. Conversely, in a case where the activity level is low (S=1) and there is a desire to suppress the excessive loss of detail caused by smoothing, a filter in which the pixel itself has considerable influence is applied, as shown in FIG. 15A.
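Purely as a sketch, the following Python fragment applies a 5-tap filter whose coefficients are switched by the activity level. The two kernels below are illustrative values chosen only to mimic the qualitative behaviour of FIGS. 15A and 15B (centre-heavy for S=1, flatter for S=2); they are not the coefficients of the embodiment.

```python
# Illustrative 5-tap kernels (each sums to 1).
KERNELS = {
    1: [1/16, 2/16, 10/16, 2/16, 1/16],   # low activity: the pixel itself dominates
    2: [2/16, 4/16,  4/16, 4/16, 2/16],   # high activity: neighbours gain influence
}

def smooth_pixel(line, i, activity):
    """Filter pixel line[i] with the kernel selected by the activity level."""
    if activity == 0:
        return line[i]                     # no smoothing at all
    kernel = KERNELS[activity]
    # Clamp indices at the ends of the line so the window stays inside the image.
    window = [line[min(max(i + k, 0), len(line) - 1)] for k in range(-2, 3)]
    return sum(w * p for w, p in zip(kernel, window))

row = [100, 100, 100, 200, 200, 200]       # a step edge across an MC block boundary
print(smooth_pixel(row, 2, 1), smooth_pixel(row, 2, 2))   # 118.75 and 137.5
```

With the step edge in this example, the S=2 kernel pulls the filtered value noticeably further toward the neighbouring block than the S=1 kernel, which is the behaviour the switching is intended to achieve.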

Therefore, according to the smoothing filter section of the fourth embodiment, the extent of the smoothing can be controlled in accordance with the activity level.

Further, the constitution may be such that, in the switching of the filter characteristics, a plurality of characteristics can be selected in accordance with the activity level, and such that information identifying the characteristics is multiplexed in a compressed stream 114 and transmitted to the image decoding apparatus. By means of such a constitution, a more detailed adaptive judgment based on image analysis on the image coding apparatus side can be reflected in the filter characteristics, and the image decoding apparatus can thus implement adaptive smoothing filter processing without performing the special image analysis processing implemented by the image encoding apparatus. The fourth embodiment is equally applicable in cases where an activity level defined for block boundaries, as described in the third embodiment, is used.

When the filter characteristics are switched, the type of filter characteristics used is transmitted as part of the header information at the start of the frame in the compressed stream, for example.
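A minimal sketch of how such a filter-characteristics identifier could be carried in the frame header is given below. The two-bit field width and its position at the start of the header are assumptions; the embodiment only states that the type is transmitted as part of the frame header information.

```python
def append_filter_type(header_bits, filter_type, field_width=2):
    """Append the filter-type field to a list of header bits (MSB first)."""
    for shift in range(field_width - 1, -1, -1):
        header_bits.append((filter_type >> shift) & 1)
    return header_bits

def read_filter_type(header_bits, offset, field_width=2):
    """Read the same field back on the image decoding apparatus side."""
    value = 0
    for bit in header_bits[offset:offset + field_width]:
        value = (value << 1) | bit
    return value

bits = append_filter_type([], filter_type=2)   # encoder writes the field
assert read_filter_type(bits, 0) == 2          # decoder recovers the same value
```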

Fifth Embodiment

FIG. 16 is a block diagram showing the constitution of the image coding apparatus according to the fifth embodiment of the present invention, and FIG. 17 is a block diagram showing the constitution of the image decoding apparatus thereof. However, those parts of the image coding apparatus of the fifth embodiment shown in FIG. 16 which correspond to parts of the first embodiment in FIG. 1 have been assigned the same reference symbols, and those parts of the image decoding apparatus of the fifth embodiment shown in FIG. 17 which correspond to parts of the first embodiment in FIG. 2 have been assigned the same reference symbols, and hence a description of all these parts is omitted here.

The image coding apparatus shown in FIG. 16 and the image decoding apparatus shown in FIG. 17 differ from those of the first embodiment insofar as another smoothing filter section 524 is employed in place of the smoothing filter section 124. The internal constitution of the smoothing filter section 524 is shown in FIG. 18, and a flowchart illustrating the operation of a block activity level calculation section 525 in the smoothing filter section 524 is shown in FIG. 19.

That is, this smoothing filter section 524 is a modification of the smoothing filter sections 124 and 224 described in the above first and second embodiments respectively, and, with the exception of the inputting of the reference image 104, can simply substitute for these filters. In the fifth embodiment, the difference is obtained between the prediction image 106 a prior to smoothing filter processing, and the reference image 104 which is in the frame memory 103 and from which the prediction image 106 a originated, and filter control is performed on the basis of the corresponding error margin electric power.

The prediction image 106 a is image data extracted from the reference image 104 using the motion vectors 105, and is image data that approximates the input image signal 101 inputted to the image coding apparatus. In other words, when points that are spatially the same in the reference image 104 and the prediction image 106 a are compared, the error margin electric power is large in parts with movement, and in parts with very little movement, the error margin electric power is small. The magnitude of the motion vectors 105 does to some extent express the movement amount, but factors that are not dependent on a change in the image, such as noise, also influence motion detection, and therefore the extent and intensity of the movement cannot be adequately expressed by this magnitude alone. The above error margin electric power, however, can be used as an indicator of the intensity of the movement, whereby the adaptability of the filter control can be improved. Further, the reference image 104 uses exactly the same data on the encoding and decoding sides and therefore, when introducing this control, implementation is possible without transmitting special identification information to the decoding apparatus.

Specifically, as shown in FIG. 18, the reference image 104 and the prediction image 106 a are inputted to the block activity level calculation section 525, and the error margin electric power between the reference image 104 and the prediction image 106 a is found for each block. As shown in step ST14 in FIG. 19, in order to reduce the amount of extra computation here, the evaluation based on the error margin electric power is skipped at points where it is judged that the activity level is zero and there is no movement. This is because, in cases where the motion vector difference mvd (X, Y) is zero, spatial continuity is maintained even if the part contains intense movement, meaning that it is not necessary to perform smoothing.

In cases where the activity level is greater than zero, the activity level is not evaluated by means of the motion vectors alone. Instead, the error margin electric power thus found is used: when it is greater than a predetermined threshold value, the activity level is changed toward a larger value, and when the error margin electric power is smaller than a predetermined threshold value, the activity level is set to zero and smoothing is not performed (step ST15). At such time, the threshold value in the direction of raising the activity level and the threshold value in the direction of lowering the activity level need not necessarily be the same.
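The control flow of steps ST14 and ST15 can be summarized by the following Python sketch. The sum-of-squared-differences measure used here for the error margin electric power, the two threshold values and the cap of the activity level at 2 are assumptions introduced only to give the description a concrete form.

```python
def block_error_power(ref_block, pred_block):
    """Error margin electric power of one block, taken here as the sum of
    squared differences between co-located reference and prediction pixels."""
    return sum((r - p) ** 2 for r, p in zip(ref_block, pred_block))

def update_activity(activity, ref_block, pred_block,
                    raise_threshold=2000, lower_threshold=200):
    """Refine a block's activity level as in steps ST14/ST15 (illustrative values).

    The raising and lowering thresholds need not be equal, as noted in the text.
    """
    if activity == 0:
        return 0                           # ST14: judged motionless, evaluation skipped
    power = block_error_power(ref_block, pred_block)
    if power > raise_threshold:
        return min(activity + 1, 2)        # intense movement: move toward a larger level
    if power < lower_threshold:
        return 0                           # negligible movement: suppress smoothing
    return activity                        # otherwise keep the level set from the vectors
```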

Further, in the fifth embodiment, as far as the reference image 104 is concerned, the constitution may be such that average values in blocks are precalculated and buffered in evaluation block units before this image is stored in the frame memory, and average values are similarly found for the prediction image 106 a, whereby the error margin electric power evaluation is performed using only average values.

Because the average values of the error margin amounts between the reference image 104 and the prediction image 106 a are the controlling components, and the average values alone can be stored in a small buffer, the frequency of access to the frame memory during an activity level calculation can be reduced without affecting the judgment of the activity level.
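One possible realization of this average-value evaluation is sketched below in Python. The 8x8 evaluation block size, the flat-list representation of a frame and the absolute difference of the means used as the error measure are all assumptions made for illustration.

```python
def block_means(pixels, width, height, block=8):
    """Precompute the average pixel value of every evaluation block of a frame.

    The frame is a flat list of width*height pixel values (width and height are
    assumed to be multiples of the block size). The means can be buffered when
    the frame enters the frame memory, so later activity evaluations read this
    small buffer instead of the frame memory itself.
    """
    means = {}
    for top in range(0, height, block):
        for left in range(0, width, block):
            total = 0
            for y in range(top, top + block):
                for x in range(left, left + block):
                    total += pixels[y * width + x]
            means[(left, top)] = total / (block * block)
    return means

def mean_error(ref_means, pred_means, left, top):
    """Approximate per-block error measure using the buffered averages only."""
    return abs(ref_means[(left, top)] - pred_means[(left, top)])
```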

Furthermore, when the activity level is allocated at the block boundaries as is the case in the third embodiment above, the constitution can also be such that partial regions that extend across the block boundaries are defined and the error margin amount between the reference image 104 and the prediction image 106 a is evaluated in these units.

In addition, in the fifth embodiment, the error margin amount between the reference image 104 and the prediction image 106 a is used to update the activity level, but it may also be used to change the filter characteristics applied at points possessing a certain predetermined activity level value. For example, in cases where the activity level of a certain block or block boundary is an intermediate value in the defined activity level range, the adaptability increases still further if the filter characteristics applied at that point are changed in accordance with the conditions. In order to achieve this object, a constitution is also possible in which the filter characteristics are switched on the basis of an evaluation of the error margin amount between the reference image 104 and the prediction image 106 a.
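A short Python sketch of this idea follows. The three-level activity range, the single threshold and the 'weak'/'strong' labels standing in for concrete filter characteristics are assumptions introduced only to illustrate the switching rule.

```python
def choose_filter(activity, error_power, strong_threshold=1000):
    """Select filter characteristics for a block or block boundary.

    activity    : level set by the block activity level calculation section (0, 1 or 2)
    error_power : error margin amount between the reference image and the
                  prediction image prior to smoothing for this block or boundary
    Returns 'none', 'weak' or 'strong' as placeholders for actual filter kernels.
    """
    if activity == 0:
        return 'none'                      # no smoothing at this point
    if activity == 2:
        return 'strong'                    # highest level: always smooth strongly
    # Intermediate level: let the reference/prediction error decide the strength.
    return 'strong' if error_power > strong_threshold else 'weak'
```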

Therefore, by means of the smoothing filter section of the fifth embodiment, the adaptability of the filter control can be improved as described above, and the reference image 104 uses exactly the same data on the encoding and decoding sides, and therefore, when introducing this control, implementation is possible without transmitting special identification information to the decoding apparatus. Further, the frequency of access to the frame memory during an activity level calculation can be reduced without affecting the judgment of the activity level.

INDUSTRIAL APPLICABILITY

The present invention can be used as an image coding apparatus and an image decoding apparatus applied to a mobile image transmission system, for example.

1. An image coding apparatus comprising: motion compensation predicting means for generating a motion-compensated prediction image by detecting movement amounts in predetermined first partial image regions of an input image with respect to a reference image; smoothing means for performing smoothing of pixels located at the boundaries of adjoining images of the first partial image regions, with respect to the motion-compensated prediction image obtained by the motion compensation predicting means; and prediction residual coding means for coding a prediction residual signal obtained from a difference between the input image and the smoothed prediction image, wherein the smoothing means further comprises, activity level setting means for determining an activity level of second partial image regions of the input image; adaptive smoothing means for establishing an intensity of the smoothing on a basis of the activity level set with said activity level setting means, and for performing smoothing of pixels located at the boundaries between adjoining images of the first partial image regions; and post-processing means for performing suppression processing of a result of smoothing by using a threshold value established in accordance with said activity level, wherein the first partial image regions are larger in size than the second partial image regions, so that at least one second partial image region fits into one of the first partial image regions, and at least one of the second partial image regions overlaps a boundary of two adjacent first partial image regions when determining said activity level by the activity level setting means, and the second partial image regions are arranged in a cross-like pattern, having a central block with two blocks arranged above and below the central block, and two other blocks arranged to the left and the right of the central block.
2. The image coding apparatus as defined in claim 1 wherein the motion compensation predicting means detects the movement amount by using encoded local decoding image data as the reference image.
3. The image coding apparatus as defined in claim 1 wherein the smoothing means performs the smoothing in accordance with a difference value for the movement amount between the adjoining images of the first partial image regions.
4. The image coding apparatus as defined in claim 1 wherein the smoothing means performs the smoothing in accordance with a difference in a coding method between the adjoining images of the first partial image regions.
5. The image coding apparatus as defined in claim 1 wherein the smoothing means performs the smoothing based on whether the adjoining images of the first partial image regions have been predicted by bidirectional prediction, by unidirectional prediction, or by I-frame prediction.
6. The image coding apparatus as defined in claim 1 wherein the smoothing means performs the smoothing in accordance with an error margin amount between the prediction image prior to smoothing, and the reference image from which the prediction image obtained by the motion compensation predicting means is generated.
7. The image coding apparatus as defined in claim 1, wherein said activity level of the second partial image regions is determined on a basis of at least one of: a difference value for the movement amount between the adjoining images of the first partial image regions, a difference in coding method between the adjoining images of the first partial image regions, a difference in image prediction method between the adjoining images of the first partial image regions, and an error margin amount between the prediction image prior to smoothing, and the reference image from which the prediction image obtained by the motion compensation predicting means is generated.
8. The image coding apparatus as defined in claim 7 wherein the activity level setting means sets the activity level for individual second partial image regions and inputs to the adaptive smoothing means the larger value of values for the activity level obtained by comparing the first partial image regions which are subjected to smoothing.
9. The image coding apparatus as defined in claim 7 wherein the activity level setting means sets the activity level for boundaries between the first partial image regions and inputs the activity level to the adaptive smoothing means.
10. The image coding apparatus as defined in claim 7 wherein the adaptive smoothing means changes the number of pixels subjected to smoothing in accordance with the activity level obtained by the activity level setting means.
11. The image coding apparatus as defined in claim 7 wherein the adaptive smoothing means switches filter characteristics for performing the smoothing in accordance with the activity level obtained by the activity level setting means.
12. The image coding apparatus as defined in claim 11 wherein the adaptive smoothing means encodes and transmits a bit that indicates a type of filter characteristics for performing the smoothing.
13. The image coding apparatus as defined in claim 1 wherein a bit that indicates a presence or absence of smoothing by the smoothing means is encoded and transmitted.
14. An image decoding apparatus comprising: motion compensation predicting means for generating a motion-compensated prediction image by detecting movement amounts in predetermined first partial image regions with respect to a reference image; smoothing means for performing smoothing of pixels located at the boundaries of adjoining images of the first partial image regions, with respect to the motion-compensated prediction image obtained by the motion compensation predicting means; prediction residual decoding means for decoding a prediction residual signal from an encoding apparatus; and adding means for obtaining a decoded image by adding a decoded prediction residual signal obtained by the prediction residual decoding means, and a smoothed prediction image from the smoothing means, wherein the smoothing means further comprises, activity level setting means for determining an activity level of second partial image regions; adaptive smoothing means for establishing an intensity of the smoothing on a basis of the activity level set with said activity level setting means, and for performing smoothing of pixels located at the boundaries between adjoining images of the first partial image regions; and post-processing means for performing suppression processing of a result of smoothing by using a threshold value established in accordance with said activity level, wherein the first partial image regions are larger in size than the second partial image regions, so that at least one second partial image region fits into one of the first partial image regions, and the second partial image regions overlap a boundary of two adjacent first partial image regions when determining said activity level by the activity level setting means, and the second partial image regions are arranged in a cross-like pattern, having a central block with two blocks arranged above and below the central block, and two other blocks arranged to the left and the right of the central block.
15. The image decoding apparatus as defined in claim 14 wherein the motion compensation predicting means obtains the prediction image by using decoded local decoding image data as the reference image.
16. The image decoding apparatus as defined in claim 14 wherein the smoothing means performs the smoothing in accordance with a difference value for the movement amount between the adjoining images of the first partial image regions.
17. The image decoding apparatus as defined in claim 14 wherein the smoothing means performs the smoothing in accordance with a difference in a decoding method between the adjoining images of the first partial image regions.
18. The image decoding apparatus as defined in claim 14 wherein the smoothing means performs the smoothing based on whether the adjoining images of the first partial image regions have been predicted by bidirectional prediction, by unidirectional prediction, or by I-frame prediction.
19. The image decoding apparatus as defined in claim 14 wherein the smoothing means performs the smoothing in accordance with an error margin amount between the prediction image prior to smoothing, and the reference image from which the prediction image obtained by the motion compensation predicting means is generated.
20. The image decoding apparatus as defined in claim 14, wherein said activity level of the second partial image regions is determined on a basis of at least one of: a difference value for the movement amount between the adjoining images of the first partial image regions, a difference in coding method between the adjoining images of the first partial image regions, a difference in image prediction method between the adjoining images of the first partial image regions, and an error margin amount between the prediction image prior to smoothing, and the reference image from which the prediction image obtained by the motion compensation predicting means is generated.
21. The image decoding apparatus as defined in claim 20 wherein the activity level setting means sets the activity level for individual second partial image regions and inputs to the adaptive smoothing means the larger value of values for the activity level obtained by comparing the first partial image regions which are subjected to smoothing.
22. The image decoding apparatus as defined in claim 20 wherein the activity level setting means sets the activity level for boundaries between the first partial image regions and inputs the activity level to the adaptive smoothing means.
23. The image decoding apparatus as defined in claim 20 wherein the adaptive smoothing means changes the number of pixels subjected to smoothing in accordance with the activity level obtained by the activity level setting means.
24. The image decoding apparatus as defined in claim 20 wherein the adaptive smoothing means switches filter characteristics for performing the smoothing in accordance with the activity level obtained by the activity level setting means.
25. The image decoding apparatus as defined in claim 24 wherein the adaptive smoothing means switches the filter characteristics on a basis of a bit that indicates a type of filter characteristics for performing the smoothing, the bit being decoded from compressed input data from the encoding apparatus.
26. The image decoding apparatus as defined in claim 14 wherein the smoothing processing is controlled on a basis of a bit that indicates a presence or absence of the smoothing, the bit being decoded from compressed input data.
27. An image coding method performed on an image coding apparatus, comprising: a motion compensation predicting step performed on a motion detection unit of the image coding apparatus of generating a motion-compensated prediction image by detecting movement amounts in predetermined first partial image regions of an input image with respect to a reference image stored in a frame memory; a smoothing step performed on a filter unit of the image coding apparatus of performing smoothing of pixels located at the boundaries of adjoining images of the first partial image regions, with respect to the motion-compensated prediction image obtained by the motion compensation predicting step; and a prediction residual coding step of coding a prediction residual signal obtained from the difference between the input image and the smoothed prediction image, wherein the smoothing step further comprises, an activity level setting step of determining an activity level of second partial image regions; an adaptive smoothing step of establishing an intensity of the smoothing on a basis of the activity level set with said activity level setting step, and of performing smoothing of pixels located at the boundaries between adjoining images of the first partial image regions; and a post-processing step of performing suppression processing of a result of the smoothing by using a threshold value established in accordance with the activity level, wherein the first partial image regions are larger in size than the second partial image regions, so that at least one second partial image region fits into one of the first partial image regions, and the second partial image regions overlap a boundary of two adjacent first partial image regions when determining said activity level by the activity level setting step, and the second partial image regions are arranged in a cross-like pattern, having a central block with two blocks arranged above and below the central block, and two other blocks arranged to the left and the right of the central block.
28. The image coding method performed on an image coding apparatus as defined in claim 27 wherein the motion compensation predicting step detects the movement amount by using encoded local decoding image data as the reference image.
29. The image coding method performed on an image coding apparatus as defined in claim 27 wherein the smoothing step performs the smoothing in accordance with a difference value for the movement amount between the adjoining images of the first partial image regions.
30. The image coding method performed on an image coding apparatus as defined in claim 27, wherein the smoothing step performs the smoothing based on whether the adjoining images of the first partial image regions have been predicted by bidirectional prediction, by unidirectional prediction, or by I-frame prediction.
31. The image coding method performed on an image coding apparatus as defined in claim 27 wherein the smoothing step performs the smoothing in accordance with a difference in an image prediction method between the adjoining images of the first partial image regions.
32. The image coding method performed on an image coding apparatus as defined in claim 27 wherein the smoothing step performs the smoothing in accordance with an error margin amount between the prediction image prior to smoothing, and the reference image from which the prediction image obtained by the motion compensation predicting step is generated.
33. The image coding method performed on an image coding apparatus as defined in claim 27, wherein said activity level of the second partial image regions is determined on a basis of at least one of: a difference value for the movement amount between the adjoining images of the first partial image regions, a difference in coding method between the adjoining images of the first partial image regions, a difference in image prediction method between the adjoining images of the first partial image regions, and an error margin amount between the prediction image prior to smoothing, and the reference image from which the prediction image obtained by the motion compensation predicting step is generated.
34. The image coding method performed on an image coding apparatus as defined in claim 33 wherein the activity level setting step sets the activity level for individual second partial image regions and inputs to the adaptive smoothing step the larger value of values for the activity level obtained by comparing the first partial image regions which are subjected to smoothing.
35. The image coding method performed on an image coding apparatus as defined in claim 33 wherein the activity level setting step sets the activity level for boundaries between the first partial image regions and inputs the activity level to the adaptive smoothing step.
36. The image coding method performed on an image coding apparatus as defined in claim 33 wherein the adaptive smoothing step changes the number of pixels subjected to smoothing in accordance with the activity level obtained by the activity level setting step.
37. The image coding method performed on an image coding apparatus as defined in claim 33 wherein the adaptive smoothing step switches filter characteristics for performing the smoothing in accordance with the activity level obtained by the activity level setting step.
38. The image coding method performed on an image coding apparatus as defined in claim 37 wherein the adaptive smoothing step encodes and transmits a bit that indicates a type of filter characteristics for performing the smoothing.
39. The image coding method performed on an image coding apparatus as defined in claim 27 wherein a bit that indicates a presence or absence of smoothing by the smoothing step is encoded and transmitted.
40. An image decoding method performed on an image coding apparatus comprising: a motion compensation predicting step performed on a motion detection unit of the image coding apparatus of generating a motion-compensated prediction image by detecting movement amounts in predetermined first partial image regions with respect to a reference image stored in a frame memory; a smoothing step performed on a filter unit of the image coding apparatus of performing smoothing of pixels located at the boundaries of adjoining images of the first partial image regions, with respect to the motion-compensated prediction image obtained by the motion compensation predicting step; a prediction residual decoding step of decoding a prediction residual signal from an encoding apparatus; and an adding step of obtaining a decoded image by adding a decoded prediction residual signal obtained by the prediction residual decoding step, and the smoothed prediction image obtained by the smoothing step, wherein the smoothing step further comprises, an activity level setting step of determining an activity level of second partial image regions; an adaptive smoothing step of establishing an intensity of the smoothing on a basis of the activity level set with said activity level setting step, and of performing smoothing of pixels located at the boundaries between adjoining images of the first partial image regions; and a post-processing step of performing suppression processing of a result of smoothing by using a threshold value established in accordance with said activity level, wherein the first partial image regions are larger in size than the second partial image regions, so that at least one second partial image region fits into one of the first partial image regions, and the second partial image regions overlap a boundary of two adjacent first partial image regions when determining said activity level by the activity level setting step, and the second partial image regions are arranged in a cross-like pattern, having a central block with two blocks arranged above and below the central block, and two other blocks arranged to the left and the right of the central block.
41. The image decoding method performed on an image coding apparatus as defined in claim 40 wherein the motion compensation predicting step obtains the prediction image by using decoded local decoding image data as the reference image.
42. The image decoding method performed on an image coding apparatus as defined in claim 40 wherein the smoothing step performs the smoothing in accordance with a difference value for the movement amount between the adjoining images of the first partial image regions.
43. The image decoding method performed on an image coding apparatus as defined in claim 40 wherein the smoothing step performs the smoothing based on whether the adjoining images of the first partial image regions have been predicted by bidirectional prediction, by unidirectional prediction, or by I-frame prediction.
44. The image decoding method performed on an image coding apparatus as defined in claim 40 wherein the smoothing step performs the smoothing in accordance with a difference in image prediction method between the adjoining images of the first partial image regions.
45. The image decoding method performed on an image coding apparatus as defined in claim 40 wherein the smoothing step performs the smoothing in accordance with an error margin amount between the prediction image prior to smoothing, and the reference image from which the prediction image obtained by the motion compensation predicting step is generated.
46. The image decoding method performed on an image coding apparatus as defined in claim 40, wherein said activity level of the second partial image regions is determined on a basis of at least one of: a difference value for the movement amount between the adjoining images of the first partial image regions, a difference in coding method between the adjoining images of the first partial image regions, a difference in image prediction method between the adjoining images of the first partial image regions, and an error margin amount between the prediction image prior to smoothing, and the reference image from which the prediction image obtained by the motion compensation predicting step is generated.
47. The image decoding method performed on an image coding apparatus as defined in claim 46 wherein the activity level setting step sets the activity level for individual second partial image regions and inputs to the adaptive smoothing step the larger value of values for the activity level obtained by comparing the first partial image regions which are subjected to smoothing.
48. The image decoding method performed on an image coding apparatus as defined in claim 46 wherein the activity level setting step sets the activity level for boundaries between the first partial image regions and inputs the activity level to the adaptive smoothing step.
49. The image decoding method performed on an image coding apparatus as defined in claim 46 wherein the adaptive smoothing step changes the number of pixels subjected to smoothing in accordance with the activity level obtained by the activity level setting step.
50. The image decoding method performed on an image coding apparatus as defined in claim 46 wherein the adaptive smoothing step switches filter characteristics for performing the smoothing in accordance with the activity level obtained by the activity level setting step.
51. The image decoding method performed on an image coding apparatus as defined in claim 50 wherein the adaptive smoothing step switches the filter characteristics on a basis of a bit that indicates a type of filter characteristics for performing the smoothing, the bit being decoded from compressed input data from the encoding apparatus.
52. The image decoding method performed on an image coding apparatus as defined in claim 40 wherein the smoothing processing is controlled on a basis of a bit that indicates a presence or absence of the smoothing, the bit being decoded from the compressed input data.
53. The image coding apparatus as defined in claim 1, wherein the activity level setting means determines a direction of a prediction for the blocks above, below, to the left, and to the right of the central block, and determines a direction of prediction for the central block, and determines if the direction of prediction is the same for all the blocks.
54. The image decoding apparatus as defined in claim 14, wherein the activity level setting means determines a direction of a prediction for the blocks above, below, to the left, and to the right of the central block, and determines a direction of prediction for the central block, and determines if the direction of prediction is the same for all the blocks.
55. The image coding method performed on an image coding apparatus as defined in claim 27, wherein the activity level setting step determines a direction of a prediction for the blocks above, below, to the left, and to the right of the central block, and determines a direction of prediction for the central block, and determines if the direction of prediction is the same for all the blocks.
56. The image decoding method performed on an image coding apparatus as defined in claim 40, wherein the activity level setting step determines a direction of a prediction for the blocks above, below, to the left, and to the right of the central block, and determines a direction of prediction for the central block, and determines if the direction of prediction is the same for all the blocks.